Abstract

The present paper discusses the limitations of Classical
Test Theory, the purpose of Item Response Theory/Latent Trait
Measurement models, and the step-by-step calculations in the
Rasch measurement model. The paper explains how IRT
transforms person abilities and item difficulties into the
same metric for test-independent and sample-independent
comparisons.

Item Response Theory: Understanding the One-Parameter Rasch Model

Item Response Theory (IRT) or Latent Trait Theory came 
about due to the limitations of classical measurement models. 
Classical measurement defines person ability, also known as 
the true score, as the expected value of performance on a 
test. The problem with the classical definition is that the
ability estimate depends on the difficulty of the items chosen
for the test. In other words, the ability estimate or score
is test dependent. Likewise, item difficulty, defined by the
classical theory as the proportion of examinees answering the
item correctly, depends on the ability of the particular
people taking the test. In other words, item difficulties are
group dependent (Hambleton & Swaminathan, 1985). Therefore,
items and examinees on different tests are measured on
different scales. Because classical theory item difficulties
and person abilities are on different scales, it is
inappropriate to compare them (Wright & Stone, 1979). Item
Response Theory, on the other hand, transforms item difficulty
and person ability estimates into statistics on a single
comparable scale that are also respectively “person-free” and
“item-free.” “Person-free” means that the item difficulty
calibrations are theoretically independent of the persons
generating the calibrations; “item-free” means that the person
ability estimates are theoretically independent of the items
used in the measurement.

IRT is based on two postulates. First, the performance
of an examinee on a test item can be predicted by a set of
factors called traits, latent traits, or abilities. Second,
the relationship between the examinee's item performance and
the set of traits underlying the performance can be described
by an item characteristic curve (Hambleton & Swaminathan, 1985).
Regardless of group membership, as the level of ability
increases, the probability of a correct response to an item
increases (Hambleton & Cook, 1977).

Three IRT models are in common use. Figure 1
presents the three-parameter model, made up of the item
discrimination “a” parameter, the item difficulty “b”
parameter, and the guessing “c” parameter (Warm, 1978). The
item discrimination parameter indicates the slope of the item
characteristic curve at its steepest point. The item
difficulty parameter indicates the location on the ability (θ)
axis where the probability of answering correctly is .50 when
guessing is negligible; with a nonzero guessing parameter that
probability is (1 + c)/2. The guessing parameter is the
probability that a correct response occurs solely by chance.
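
As a hedged illustration of how the three parameters shape the curve, the three-parameter model is commonly written as P(θ) = c + (1 - c) / (1 + exp(-a(θ - b))). The Python sketch below assumes this logistic form without the optional scaling constant that some texts include; the function names are illustrative and do not come from the sources cited here.

```python
import math


def icc_3pl(theta, a, b, c=0.0):
    """Probability of a correct response under a three-parameter logistic curve.

    theta: person ability, a: discrimination, b: difficulty, c: guessing.
    Assumes the plain logistic form (no extra scaling constant).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def icc_2pl(theta, a, b):
    """Two-parameter curve: guessing fixed at zero."""
    return icc_3pl(theta, a, b, c=0.0)


def icc_rasch(theta, b):
    """One-parameter (Rasch) curve: guessing zero, discrimination fixed at one."""
    return icc_3pl(theta, a=1.0, b=b, c=0.0)
```

For example, `icc_rasch(0.0, 0.0)` evaluates to 0.5, while `icc_3pl(0.0, 1.2, 0.0, 0.2)` evaluates to 0.6, which is halfway between the guessing level and 1.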

INSERT FIGURE 1 ABOUT HERE 



Figure 2 presents the two-parameter model. Notice that
the item characteristic curves have a lower asymptote of zero,
because the guessing parameter is considered negligible. The
one-parameter or Rasch model is presented in Figure 3. In the
Rasch model the guessing parameter is considered negligible
and item discrimination is treated as equal across items
(Hambleton & Swaminathan, 1985). The present paper will focus
on the calculations involved in the Rasch or one-parameter
model.



INSERT FIGURES 2 AND 3 ABOUT HERE 



The purpose of the Rasch model is to analyze differences
in test scores that initially are not linear (Wright & Stone,
1979). To analyze these differences, data must be transformed
into measures that are approximately linear. To achieve
approximate linearity, probabilities are converted into
logits, as presented in Table 1. Figure 4 presents a graph of
the initial probabilities and a graph of the logit
transformations of the probabilities. The transformation of
probabilities to logits allows researchers to compare item
difficulties and person abilities across tests (Warm, 1978).
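
The conversion from a probability to a logit can be sketched in a few lines of Python. This is a minimal illustration of the standard log-odds transformation, ln(p / (1 - p)); the function name `logit` is chosen here for illustration.

```python
import math


def logit(p):
    """Convert a probability strictly between 0 and 1 into log-odds (logits)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


# Probabilities bunch up near 0 and 1, while logits spread out symmetrically
# around zero.
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(p, round(logit(p), 3))
```

A probability of .50 maps to a logit of zero, and complementary probabilities such as .25 and .75 map to logits of equal size and opposite sign.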

INSERT TABLE 1 AND FIGURE 4 ABOUT HERE 



The Rasch Model begins with a matrix of all items by 
persons, as presented in Table 2. Rows are persons, while 
columns are items. Within the matrix, a 1 denotes a correct 
response, while a 0 denotes an incorrect response. The final
column presents the proportion of correct responses to the
total number of responses for each person, while the final row
presents the proportion of correct responses to the total
number of responses for each item. Next, as seen in Table 3,
the persons and items with all correct or all incorrect
responses are removed. No estimates can be obtained for these
persons and items, because the data contain no information
about their abilities or difficulties. Notice in Table 3 that
items 18, 1, 2, and 3 and persons 35 and 36 are not included.
Item 18 contains no information because no person answered it
correctly; therefore, it would be impossible to estimate how
difficult the item really is given only these data. Likewise,
person 35 answered no items correctly, leaving no way to
assess this person's ability with the available data.
When item 18 was omitted, person 36 was left with a perfect
score. Therefore, person 36 had to be eliminated. Person 35
was omitted for missing all the items, leaving items 1, 2, and
3 with all correct responses. Therefore, these items had to be
eliminated also. After eliminating these items and persons,
new proportions are calculated using the remaining 34 persons
and 14 items displayed in Table 3.
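
Because removing an item can leave a person with a perfect or zero score, and removing a person can do the same to an item, the editing step has to be repeated until nothing more drops out. The following Python sketch shows one way to do this; the data layout (a dictionary of persons, each mapping item labels to 0 or 1) and the function name are assumptions made for illustration, and the toy data are not the data of Table 2.

```python
def edit_matrix(responses):
    """Repeatedly drop persons and items whose responses are all 0 or all 1.

    responses: dict mapping person -> dict mapping item -> 0 or 1.
    Returns the sorted lists of retained persons and retained items.
    """
    persons = set(responses)
    items = set(next(iter(responses.values())))
    changed = True
    while changed:
        changed = False
        for person in list(persons):
            score = sum(responses[person][item] for item in items)
            if score == 0 or score == len(items):
                persons.discard(person)
                changed = True
        for item in list(items):
            count = sum(responses[person][item] for person in persons)
            if count == 0 or count == len(persons):
                items.discard(item)
                changed = True
    return sorted(persons), sorted(items)


# Toy data: item 4 is missed by everyone, person "C" misses everything.
toy = {
    "A": {1: 1, 2: 1, 3: 0, 4: 0},
    "B": {1: 1, 2: 0, 3: 1, 4: 0},
    "C": {1: 0, 2: 0, 3: 0, 4: 0},
    "D": {1: 1, 2: 1, 3: 1, 4: 0},
}
kept_persons, kept_items = edit_matrix(toy)
```

In the toy data, dropping item 4 leaves person "D" with a perfect score, and dropping person "C" leaves item 1 answered correctly by everyone, so both are removed on later passes, mirroring the cascade described for item 18 and person 36.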

INSERT TABLES 2 AND 3 ABOUT HERE 



The next step in the Rasch model is to calibrate the 
initial item difficulties, as presented in Table 4. Item 
scores are listed in descending order by the number of correct 
responses and then by the frequency of their occurrence. Then 
the proportions are converted into logits. Logits are
calculated by taking the natural log of the ratio of the
proportion incorrect to the proportion correct. Once
the proportions are transformed into logits, the mean and
variance of the distribution are computed. The mean (Avg) is
then used to center the item logits at zero, and the variance
(U) will be used in computing final calibrations. Notice that
logits (d) are no longer bounded by zero and one, but have
been transformed to a new scale that is infinite in both
directions and is approximately linear in the underlying
variable.
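
The initial item calibration can be sketched as follows. The sketch assumes the items are summarized as (number correct, frequency) pairs, as in the description of Table 4, and that the variance is computed with a divisor of the number of items minus one; both the input layout and the divisor are assumptions of this illustration, as is the function name.

```python
import math


def initial_item_calibration(item_scores, n_persons):
    """Centered item logits and their variance U.

    item_scores: list of (number_correct, frequency) pairs for the edited items.
    n_persons: number of persons remaining after editing.
    """
    # ln(proportion incorrect / proportion correct) for each item score group
    logits = [(math.log((n_persons - s) / s), f) for s, f in item_scores]
    n_items = sum(f for _, f in logits)
    mean = sum(x * f for x, f in logits) / n_items
    # Assumed divisor: number of items minus one
    variance_u = sum(f * (x - mean) ** 2 for x, f in logits) / (n_items - 1)
    centered = [x - mean for x, _ in logits]
    return centered, variance_u


# Three hypothetical items answered correctly by 3, 2, and 1 of 4 persons
centered, variance_u = initial_item_calibration([(3, 1), (2, 1), (1, 1)], 4)
```

Easy items receive negative logits and hard items positive logits, so after centering the calibrations sum to zero across items.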



INSERT TABLE 4 ABOUT HERE 



Once the initial item difficulties are calibrated, the 
initial person abilities are calibrated, as presented in Table 
5. First, the possible number-correct scores are listed in
ascending order, along with their associated frequencies.
Then, the proportions are converted into logits by taking the
natural log of the ratio of the proportion of successes to the
proportion of failures. The mean (ydot) and variance (V) are
then calculated. The variance will be used in calculating the
expansion factors for final calibrations.
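
A parallel sketch for the initial person measures is given below. It assumes the persons are summarized as a mapping from each observed number-correct score to the number of persons with that score, and it again assumes a variance divisor of the number of persons minus one; these choices and the function name are illustrative.

```python
import math


def initial_person_calibration(score_freqs, n_items):
    """Initial person logits with their mean and variance V.

    score_freqs: dict mapping number-correct score -> number of persons.
    n_items: number of items remaining after editing.
    """
    # ln(proportion of successes / proportion of failures) for each score
    logits = {r: math.log(r / (n_items - r)) for r in score_freqs}
    n_persons = sum(score_freqs.values())
    mean = sum(score_freqs[r] * logits[r] for r in score_freqs) / n_persons
    # Assumed divisor: number of persons minus one
    variance_v = sum(
        score_freqs[r] * (logits[r] - mean) ** 2 for r in score_freqs
    ) / (n_persons - 1)
    return logits, mean, variance_v


# Hypothetical score distribution on a 14-item test
logits, mean, variance_v = initial_person_calibration({5: 1, 7: 2, 9: 1}, 14)
```

On a 14-item test a score of 7 corresponds to an initial logit of zero, and scores of 5 and 9 correspond to logits of equal size and opposite sign.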



INSERT TABLE 5 ABOUT HERE 



To reach the final estimates for item difficulties and
person abilities, expansion factors are applied to the initial
estimates. The purpose of the expansion factors is to remove
the effects of sample spread and test width, giving final
estimates that are neither person dependent nor item dependent.
The formula for the expansion factor for person abilities due
to test width is:

$$\sqrt{\frac{1 + U/2.89}{1 - UV/8.35}},$$

where U from Table 4 is the variance for item difficulties and
V from Table 5 is the variance for person abilities. The
formula for the expansion factor due to sample spread is:

$$\sqrt{\frac{1 + V/2.89}{1 - UV/8.35}}.$$

In Table 6 the sample spread expansion factor is multiplied by
the initial item calibration to yield the corrected item
calibration. Likewise, in Table 7 the test width expansion
factor is applied to the initial person measure to yield the
corrected or final person ability measure.
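
The two expansion factors can be computed directly from U and V. The Python sketch below follows the two formulas given above; the function name and the guard against a non-positive denominator are additions made for illustration.

```python
import math


def expansion_factors(variance_u, variance_v):
    """Return (person_factor, item_factor).

    person_factor: expansion for person abilities due to test width.
    item_factor: expansion for item difficulties due to sample spread.
    """
    denominator = 1.0 - (variance_u * variance_v) / 8.35
    if denominator <= 0.0:
        raise ValueError("U * V must be smaller than 8.35")
    person_factor = math.sqrt((1.0 + variance_u / 2.89) / denominator)
    item_factor = math.sqrt((1.0 + variance_v / 2.89) / denominator)
    return person_factor, item_factor


# Final estimates: initial logit multiplied by the matching factor
person_factor, item_factor = expansion_factors(1.5, 0.8)
final_ability = person_factor * math.log(5.0 / 9.0)  # a score of 5 on 14 items
```

The values 1.5 and 0.8 are hypothetical variances chosen only to exercise the function. Both factors are at least 1, so the corrected estimates are always at least as spread out as the initial ones.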



INSERT TABLES 6 AND 7 ABOUT HERE 



The Rasch model does not end with the final estimates of
item difficulty and person ability. The fit of the model to
the data must be evaluated (Hambleton & Cook, 1977), and not
simply assumed. This is done by observing the differences
between estimates of ability and difficulty for each person
and item. Table 8 is a matrix of the responses of the 34
persons to 14 items. The last row presents the item
difficulties, while the last column presents the person
abilities. The double line in the table represents the point
where person ability equals item difficulty. In theory, all
the responses to the left of or below the double line should be
correct. Likewise, all the responses to the right of or above
the line should be incorrect. Answers not fitting the
theory are considered aberrations. In Table 8 these
aberrations are underlined. For example, person 4 has two
aberrant responses: item 4 and item 7. Item 14 has three
aberrant responses: persons 23, 34, and 15.

INSERT TABLE 8 ABOUT HERE 



Once the aberrations are identified, a fit analysis is 
computed for individual persons and items. Table 9 is an 
example of a fit analysis for person 19. The line between 
item 10 and item 11 represents the point where the person 
ability, 0.357, is equal to item difficulty, between .0375 and 
1.174. According to the model, everything to the left of the 
line should be correct, denoted 1. Everything to the right of 
the line should be incorrect, denoted 0. There are four 
responses that do not fit the model: items 6, 9, 10, and 13. 
These aberrations are underlined. To compute the fit analysis,
the difference between the person ability (b) and each item
difficulty (d) is first calculated. Next, a z² is calculated
for each aberrant item using the formula:

$$z^2 = \exp(|b - d|).$$

The variance (V) is then calculated by dividing the sum of the
z² values by the number of items minus one (v-1). The variance
is used to calculate a t-statistic with v-1 degrees of freedom
using the formula:

$$t = \left[\ln(V) + (V - 1)\right]\sqrt{\frac{v - 1}{8}}.$$

For example, the calculated t-value for person 19 is 2.24,
compared with the critical t-value at alpha=0.05, which is
2.160. Therefore, person 19 is not consistent with the model
and should be removed from the data.

INSERT TABLE 9 ABOUT HERE 



Not only is the fit analysis calculated for persons, but 
it is also calculated for items. Table 10 is an example of a 
fit analysis for item 8. The line between person 27 and
person 11 represents the point where item difficulty,
-1.836, is equal to person ability, between -1.973 and -1.266.
Theoretically, everything above the line should be incorrect,
while everything below should be correct. There are eight
responses that are aberrant: persons 33, 27, 11, 12, 9, 29,
31, and 34. A z² is then calculated for each aberrant person
using the same formula that was used for person fit analysis.
The z² values are summed and divided by the number of people
minus one (n-1) to calculate the variance. A t-statistic with
n-1 degrees of freedom is then calculated using the formula:

$$t = \left[\ln(V) + (V - 1)\right]\sqrt{\frac{n - 1}{8}}.$$

For example, the calculated t-value for item 8 is 3.65,
compared with the critical t-value at alpha=0.05, which is
2.042. Therefore, item 8 is not consistent with the model and
should be removed from the data.
In fact, all items and persons found to be statistically 
significant are removed from the data and the entire analysis 
is repeated from the remaining score distributions until no 
items or persons are statistically significant. 
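
The fit calculation can be checked with a short Python sketch. It assumes the fit formula reads t = [ln(V) + (V - 1)] * sqrt((n - 1) / 8), which is the form that reproduces the worked values reported for item 8 (a sum of z² values of 68.32 over 34 persons); the function names are illustrative.

```python
import math


def z_squared(ability, difficulty):
    """z-squared for one aberrant response: exp(|b - d|)."""
    return math.exp(abs(ability - difficulty))


def fit_t(sum_z_squared, count):
    """Fit t-statistic from a sum of z-squared values.

    count is the number of persons (item fit) or items (person fit);
    the statistic is referred to count - 1 degrees of freedom.
    """
    variance = sum_z_squared / (count - 1)
    return (math.log(variance) + (variance - 1.0)) * math.sqrt((count - 1) / 8.0)


# Item 8: sum of z-squared values 68.32 across 34 persons
t_item_8 = fit_t(68.32, 34)
print(round(t_item_8, 2))
```

With these inputs the variance is about 2.07 and the statistic is about 3.65, in line with the value reported for item 8. A person of ability 0 who misses item 8 (difficulty -1.836) contributes a z² of about 6.27.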



INSERT TABLE 10 ABOUT HERE

To test whether the final calibrations are truly group
independent, researchers may choose to do a cross validation. 
By tradition, this is typically done by dividing persons in a 
large sample with a large spread into six ability groupings. 
Item calibrations are then computed separately for each group. 
If the item calibrations for the total sample are similar to 
the six separate sets of item calibrations, then there is 
evidence that the final calibrations are sample independent. 
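
A simplified sketch of this cross-validation is shown below. It splits persons into six groups by raw score and computes centered item logits within a group; for brevity it stops at the initial calibrations and omits the expansion factors, and it skips any item that every member of a group answers correctly or incorrectly. The data layout and function names are assumptions made for illustration.

```python
import math


def split_by_score(person_scores, n_groups=6):
    """Sort persons by raw score and cut them into nearly equal groups."""
    ordered = sorted(person_scores, key=person_scores.get)
    size, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        groups.append(ordered[start:end])
        start = end
    return groups


def centered_item_logits(responses, persons, items):
    """Centered ln((1 - p) / p) per item, computed within one group of persons."""
    logits = {}
    for item in items:
        right = sum(responses[person][item] for person in persons)
        if 0 < right < len(persons):  # skip items with no information
            logits[item] = math.log((len(persons) - right) / right)
    mean = sum(logits.values()) / len(logits) if logits else 0.0
    return {item: x - mean for item, x in logits.items()}
```

Comparing the dictionaries returned for each group with the calibrations from the total sample gives a rough check on sample independence.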

The group dependence and test dependence of the classical
measurement models have limited the appropriateness of
comparing items and persons across tests. But with IRT and
the Rasch model, item difficulties and person abilities can
now be compared linearly, free of group and test dependence,
if the IRT model fits the data.

However, Lawson (1991) has raised concerns about how 
often this occurs. Lawson (1991) analyzed the differences 
between the classical measurement model and the Rasch model to 
evaluate the benefits of using the Rasch model. The analysis
revealed that both procedures, classical and Rasch, yielded
almost perfectly correlated results with regard to both person
abilities and item difficulties. These similarities are
obscured only because IRT models express both person abilities
and item difficulties in logits, which are units unfamiliar to
some people.

In the present paper, to analyze the differences between
the Rasch calibrations and the classical measurement
calibrations, a regression analysis was performed both to find
the correlation between the two measures and to produce a
scatterplot of the two measures with their regression line.
Table 11 presents the item probabilities from Table 3 and the
item difficulties from Table 6. The probabilities and
difficulties were correlated using a regression analysis, which
revealed an r = -.985. This supports Lawson's analysis that
the two sets of statistics are very highly correlated.
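
The reported correlation can be recomputed from the item probabilities and difficulties listed in Table 11. The Python sketch below uses a plain Pearson correlation written out by hand; the function name is illustrative, and the data are copied from that table.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Item probabilities and Rasch difficulties as listed in Table 11
probabilities = [0.941176471, 0.941176471, 0.882352941, 0.764705882,
                 0.764705882, 0.735294118, 0.676470588, 0.382352941,
                 0.264705882, 0.176470588, 0.117647059, 0.058823529,
                 0.029411765, 0.029411765]
difficulties = [-4.415, -4.415, -3.299, -2.067, -2.067, -1.836, -1.418,
                0.375, 1.174, 1.938, 2.637, 3.753, 4.820, 4.820]

r = pearson_r(probabilities, difficulties)
print(round(r, 3))
```

The correlation is negative because a high proportion correct corresponds to a low (easy) difficulty in logits.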

INSERT TABLE 11 ABOUT HERE 



Table 12 presents the number of items correct from Table
3 and the person abilities from Table 7. The number correct 
and the person abilities were correlated using a regression 
analysis revealing an r = .997. Again, the two sets of 
statistics are very highly correlated. Figures 5 and 6 present 
scatterplots of the item probabilities and item difficulties 
and the number correct and the person abilities and their 
associated regression lines. Again, this confirms Lawson's 
claim that the two sets of statistics are almost identical. 

INSERT TABLE 12 AND FIGURES 5 AND 6 ABOUT HERE

The results from Lawson's chapter and the present paper
challenge the idea that Rasch latent trait measurement is
superior to classical measurement because its estimates are
item-free and sample-free. The high correlations between the
two measures can be explained by only one of two
possibilities: (1) the calibrations in the Rasch model are not
truly item-free and sample-free, or (2) the calibrations in
the classical measurement model are also item-free and
sample-free. Though Rasch model procedures may be superior in
other ways (Lawson, 1991), the superiority does not arise from
unique item-free and sample-free calibrations.

Table 10 (continued). Fit analysis for item 8

Person   #correct   Ability (b)   Response   b - d    z²
29        7         0             1          1.836
31        7         0             0          1.836     6.271
10        7         0             0          1.836     6.271
18        7         0             1          1.836
14        7         0             1          1.836
32        8         0.619         1          2.455
20        9         1.266         1          3.102
21        9         1.266         1          3.102
22       10         1.973         1          3.809
23       10         1.973         1          3.809
34       10         1.973         0          3.809    45.11
15       11         2.797         1          4.633
7        12         3.858         1          5.694
24       12         3.858         1          5.694

SOS = 68.32
V = SOS / (n-1) = 68.32 / 33 = 2.07

t(df=n-1) = (ln(V) + (V-1)) * (((n-1)/8)**.5)
          = (ln(2.07) + (2.07-1)) * (((34-1)/8)**.5)
          = (0.728 + (2.07-1)) * (((34-1)/8)**.5)
          = (0.728 + 1.07) * (((34-1)/8)**.5)
          = 1.798 * (((34-1)/8)**.5)
          = 1.798 * ((33/8)**.5)
          = 1.798 * (4.125**.5)
          = 1.798 * 2.031
        t = 3.652

Table 11. Item probabilities and difficulties

Item   Probability    Difficulty
 4     0.941176471    -4.415
 5     0.941176471    -4.415
 7     0.882352941    -3.299
 6     0.764705882    -2.067
 9     0.764705882    -2.067
 8     0.735294118    -1.836
10     0.676470588    -1.418
11     0.382352941     0.375
13     0.264705882     1.174
12     0.176470588     1.938
14     0.117647059     2.637
15     0.058823529     3.753
16     0.029411765     4.820
17     0.029411765     4.820

Table 12. Number of items correct and person abilities

Person   #correct   Ability
25        2         -3.858
 4        2         -3.858
33        3         -2.797
 1        3         -2.797
27        4         -1.973
11        5         -1.266
12        5         -1.266
17        5         -1.266
19        5         -1.266
30        6         -0.619
 2        6         -0.619
 3        6         -0.619
 5        6         -0.619
 6        6         -0.619
 8        6         -0.619

Figure Captions

Figure 1. Three-parameter item characteristic curves.

Figure 2. Two-parameter item characteristic curves.

Figure 3. One-parameter item characteristic curves.

Figure 4. Graph of probability proportions and probability
proportions transformed into logits.

Figure 5. Scatterplot of item probabilities with item
difficulties.

Figure 6. Scatterplot of number correct with person
abilities.

Figure 1. Three-parameter item characteristic curves (curves
labeled by item; horizontal axis: Ability).

Figure 2. Two-parameter item characteristic curves.

Figure 3. One-parameter item characteristic curves.

Figure 4. Graph of probability proportions and probability
proportions transformed into logits.

Figure 5. Scatterplot of item probabilities with item
difficulties (axis label: Difficulty).

Figure 6. Scatterplot of number correct with person
abilities.